Risk probability optimization problem for finite horizon continuous time Markov decision processes with loss rate
Similar resources
Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes
We develop four simulation-based algorithms for finite-horizon Markov decision processes. Two of these algorithms are developed for finite state and compact action spaces while the other two are for finite state and finite action spaces. Of the former two, one algorithm uses a linear parameterization for the policy, resulting in reduced memory complexity. Convergence analysis is briefly sketche...
Finite-horizon variance penalised Markov decision processes
We consider a finite horizon Markov decision process with only terminal rewards. We describe a finite algorithm for computing a Markov deterministic policy which maximises the variance penalised reward and we outline a vertex elimination algorithm which can reduce the computation involved.
Continuous time Markov decision processes
In this paper, we consider denumerable state continuous time Markov decision processes with (possibly unbounded) transition and cost rates under average criterion. We present a set of conditions and prove the existence of both average cost optimal stationary policies and a solution of the average optimality equation under the conditions. The results in this paper are applied to an admission con...
Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions
Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (or minimize costs) in a given stochastic dynamical environment. In this paper, we extend this model by incorporating additional information that the transitions due to act...
Finite-Horizon Markov Decision Processes with State Constraints
Markov Decision Processes (MDPs) have been used to formulate many decision-making problems in science and engineering. The objective is to synthesize the best decision (action selection) policies to maximize expected rewards (minimize costs) in a given stochastic dynamical environment. In many practical scenarios (multi-agent systems, telecommunication, queuing, etc.), the decision-making probl...
Journal
Journal title: Kybernetika
Year: 2021
ISSN: 1805-949X, 0023-5954
DOI: https://doi.org/10.14736/kyb-2021-2-0272